E-HMM approach for learning and adapting sound models for speaker indexing

Authors

  • Sylvain Meignier
  • Jean-François Bonastre
  • Stéphane Igounet
Abstract

This paper presents an iterative process for blind speaker indexing based on an HMM. The process detects speakers and adds them one after the other to an evolutive HMM (E-HMM). This HMM approach takes advantage of the different components of the AMIRAL automatic speaker recognition (ASR) system from LIA: front-end processing, learning, and log-likelihood ratio computation. The proposed solution reduces missed detections of short utterances by exploiting all the information (detected speakers) as soon as it is available. The proposed system was tested on the N-speaker segmentation task of the NIST 2001 evaluation campaign. Experiments were carried out to validate the speaker detection. Moreover, these tests measure the influence of the parameters used for learning the speaker models.
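The add-one-speaker-at-a-time loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation and does not use AMIRAL: it assumes 1-D features, one Gaussian per state, a fixed switch penalty in place of trained transition probabilities, and a naive seed choice (the first run of frames still attributed to the generic state 0). All function names and parameters (`index_speakers`, `seed_len`, `switch_penalty`, and so on) are hypothetical.

```python
import math

def fit_gaussian(values, var_floor=0.05):
    """Fit a 1-D Gaussian (mean, floored variance) to a list of values."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, max(var, var_floor)

def log_likelihood(x, model):
    mean, var = model
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def viterbi(frames, models, switch_penalty):
    """Best state path; changing state costs a fixed penalty (assumption)."""
    n_states = len(models)
    score = [log_likelihood(frames[0], m) for m in models]
    back = []
    for x in frames[1:]:
        new_score, pointers = [], []
        for s in range(n_states):
            best_prev = max(
                range(n_states),
                key=lambda p: score[p] - (switch_penalty if p != s else 0.0),
            )
            cost = switch_penalty if best_prev != s else 0.0
            new_score.append(score[best_prev] - cost + log_likelihood(x, models[s]))
            pointers.append(best_prev)
        score = new_score
        back.append(pointers)
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

def find_seed(labels, seed_len):
    """Start index of the first run of seed_len frames still in state 0."""
    run = 0
    for i, lab in enumerate(labels):
        run = run + 1 if lab == 0 else 0
        if run == seed_len:
            return i - seed_len + 1
    return None

def index_speakers(frames, seed_len=4, switch_penalty=2.0,
                   max_speakers=8, max_passes=10):
    # State 0 is a generic model trained on all frames (toy assumption).
    models = [fit_gaussian(frames)]
    labels = [0] * len(frames)
    while len(models) - 1 < max_speakers:
        start = find_seed(labels, seed_len)
        if start is None:
            break  # nothing left to attribute to a new speaker
        # Add one new speaker state, trained on the seed segment.
        models.append(fit_gaussian(frames[start:start + seed_len]))
        new_id = len(models) - 1
        # Alternate decoding and re-estimation until the labels are stable.
        for _ in range(max_passes):
            new_labels = viterbi(frames, models, switch_penalty)
            if new_labels == labels:
                break
            labels = new_labels
            for k in range(1, len(models)):
                owned = [x for x, lab in zip(frames, labels) if lab == k]
                if owned:
                    models[k] = fit_gaussian(owned)
        if new_id not in labels:
            models.pop()  # the new state attracted no frames; stop adding
            break
    return labels, len(models) - 1

# Toy signal: speaker A, then speaker B, then speaker A again.
toy_frames = ([0.0, 0.2, 0.1, -0.1, 0.0, 0.1]
              + [4.0, 4.2, 3.9, 4.1, 4.0, 3.8]
              + [0.1, 0.0, -0.2, 0.1, 0.2, 0.0])
toy_labels, toy_count = index_speakers(toy_frames)
print(toy_labels, toy_count)
```

Because every speaker already found stays in the model while the next one is searched for, frames from a returning speaker are attributed to the existing state rather than treated as new, which is one way to read the abstract's point about exploiting detected speakers as soon as they are available.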


Similar resources

Convolutional neural network with adaptive windows for speech recognition

Although speech recognition systems are widely used and their accuracy is continuously increasing, there is a considerable gap between their accuracy and human recognition ability. This is partially due to high speaker variation in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...


Unsupervised learning of HMM topology for text-dependent speaker verification

Usually, text-dependent speaker verification can achieve better performance than a text-independent system because of the constraint that the enrollment and testing utterances share the same phonetic content. However, the enrollment data for a text-dependent system is usually very limited. Expectation Maximization (EM) training of an HMM will suffer from noisy estimation because of limited enrollment. A...


Speaker adaptation experiments using nonstationary-state hidden Markov models: a MAP approach

In this paper, we report our recent work on applications of the MAP approach to estimating the time-varying polynomial Gaussian mean functions in the nonstationary-state or trended HMM. Assuming uncorrelatedness among the polynomial coefficients in the trended HMM, we have obtained analytical results for the MAP estimates of the time-varying mean and precision parameters. We have implemented a ...


A system for voice conversion based on adaptive filtering and line spectral frequency distance optimization for text-to-speech synthesis

This paper proposes a new voice conversion algorithm that modifies the source speaker’s speech to sound as if produced by a target speaker. To date, most approaches for speaker transformation are based on mapping functions or codebooks. We propose a linear filtering based approach to the problem of mapping the spectral parameters of one speaker to those of the other. In the proposed method, the...


Rapid and effective speaker adaptation of convolutional neural network based models for speech recognition

Recently, we have proposed a novel fast adaptation method for the hybrid DNN-HMM models in speech recognition [1]. This method relies on learning an adaptation NN that is capable of transforming input speech features for a certain speaker into a more speaker independent space given a suitable speaker code. Speaker codes are learned for each speaker during adaptation. The whole multi-speaker tra...



Publication date: 2001